Chapter 5: Greedy Algorithms
    
        - 
            Greedy Algorithms
 
                - 
                    Chooses action that provides most immediate benefit.
                
 
- 
            Minimum Spanning Trees
 
                - 
                    Removing a cycle edge will not disconnect a graph.
                
- 
                    All spanning trees of graph \(G = (V, E) \) have \(|V|-1\) edges.
                
- 
                    Kruskal's Algorithm: repeatedly add lightest edge that doesn't create a cycle. 
 Reverse Kruskal's: repeatedly remove largest edge that doesn't disconnect the graph.
- 
                    Cut Property: If edge \(e\) is the lightest edge across cut \((S, V-S)\), then \(e\) is a
                    part of some MST.
                
- 
                    Cycle Property: If edge \(e\) had weight larger than the sum of the other edges in a cycle
                    it's part of, it cannot be part of an MST.
                
- 
                    Prim's Algorithm: start with empty set of edges, repeatedly find cuts and add the lighest edge
                    across the cut. 
 A straightforward way of obtaining cuts is 'growing' a tree \(T\). The cut is then \((T, V-T)\).
 
- 
            Union-Find
 
                - 
                    makeset(x) - make a singleton set of vertex x.
                
- 
                    find(x) - find root of x's component.
                
- 
                    union(x, y) - merge x and y's components.
                
- 
                    For any x, the rank of x's parent is larger than the rank of x.
                
- 
                    For any node of rank \(k\) there are at least \(2^k\) nodes in its subtree.
                
- 
                    If there are \(n\) elements, there can be at most \(\frac{n}{2^k}\) elements of rank \(k\).
                
- 
                    Path compression: during find(x), make root the parent of all nodes on path from x to root.
                
 
- 
            Huffman Encoding
 
                - 
                    Make each character into a node containing its frequency, heapify nodes.
                
- 
                    Cost of tree: \(\sum \limits_{i=1}^n f_i \cdot (\text{depth of }i\text{th symbol in tree)}\).
                
- 
                    Entropy: number of bits needed to encode sequence. 
 \(\sum \limits_{i=1}^n p_i \log(\frac{1}{p_i})\)
- 
                    more compressible = less random = more predictable
                
- 
                    If some character occurs with frequency more than \(\frac{2}{5}\), there is always a code of length
                    1. 
 If all characters occur with frequency less than \(\frac{1}{3}\), there is no code of length 1.
 Proof from Disc 5
                        Fa18
 
- 
            Horn Formulas
 
                - 
                    Consists of implications and negative clauses
                
- 
                    implications: conjunctions of literals implying another literal. 
 e.g. \((z \wedge x) \Longrightarrow y\)
- 
                    negative clauses: disjunctions of negative literals. 
 e.g. \((\bar x \vee \bar y)\)
- 
                    satisfying assignment: a set of assignments to the literals that satisfy all the clauses.
                
- 
                    Greedy Algorithm: start with all false, for all literals, set right hand side to true if left is
                    true. 
 If assignment satisfies negative clauses, return assignment, else return false.
- 
                    Algorithm assigns least number of variables to true / variables that algorithm sets to true must be
                    true in any satisfying assignment. 
 Proof: algorithm only assigns variables to true if LHS of implication is true.
- 
                    Exists linear time algorithm.
                
 
- 
            Distance from Optimality
 
                - 
                    Example: Set Cover
                
- 
                    Greedy: repeatedly choose set \(S\) with largest number of uncovered elements.
                
- 
                    Optimal uses \(k\) sets, so after any number of iterations, the remaining vertices can be covered
                    with \(k\) sets.
                
- 
                    There is a set of at least \(n/k\) vertices, \(n = \) number of remaining vertices. Hence greedy
                    leaves at
                    most \(n(1-\frac{1}{k})\) vertices.
                
- 
                    Since \(1-x \leq e^{-x}\), \(n_t \leq n_0(1-\frac{1}{k})^t \leq n_0(e^{-1/k})^t = n_0e^{-t/k}\) 
 At \(t = k\ln n\), \(n_t < 1\). Hence greedy uses at most \(k \ln n\) sets.
 
- 
            Exchange Argument
 
                - 
                    Assume exists optimal solution with lower cost.
                
- 
                    Since lower cost, optimal solution \(O\) and greedy solution \(A\) must differ.
                
- 
                    Change \(O\) to make it more similar to \(A\) show that the change does not make the solution worse.
                
- 
                    Since any exchanges do not make \(O\) worse, there is no optimal solution with lower cost.